Glossary

A repository of acronyms, jargon, and useful words for product and customer teams

A

Anomaly Detection

Anomaly detection is the process of finding data points that are outliers from the rest of a data set.

Read full description
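
As a rough illustration of the anomaly detection definition above, the sketch below flags values that sit far from the mean of a data set, measured in standard deviations (a z-score). The function name, the sample readings, and the threshold of 2.0 are assumptions chosen for this example; z-scores are only one of many possible detection techniques.

```python
from statistics import mean, stdev


def find_anomalies(values, threshold=2.0):
    """Return the values whose z-score magnitude exceeds the threshold.

    The threshold of 2.0 is an illustrative assumption, not a standard.
    """
    mu = mean(values)
    sigma = stdev(values)  # sample standard deviation
    if sigma == 0:
        # All values are identical, so nothing can be an outlier.
        return []
    return [v for v in values if abs(v - mu) / sigma > threshold]


# Hypothetical sensor readings with one obvious outlier.
readings = [10.1, 9.8, 10.0, 10.2, 9.9, 10.0, 25.0, 10.1]
anomalies = find_anomalies(readings)
print(anomalies)
```

A single large outlier inflates both the mean and the standard deviation, so on small data sets a robust alternative such as the median absolute deviation may behave better.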

Apache Arrow

Apache Arrow is a language-agnostic software framework for developing data analytics applications that process columnar data. It contains a standardized column-oriented memory format that can represent flat and hierarchical data for efficient analytic operations.

Read full description

Apache Arrow Flight SQL

Apache Arrow Flight SQL provides a high-performance SQL interface for working with databases over a network.

Read full description

Apache DataFusion

DataFusion is an in-memory query planning, optimization, and execution framework. DataFusion was created in 2017 and donated to the Apache Arrow project in 2019.

Read full description

Apache Parquet

Apache Parquet is an open source columnar data file format that supports different encoding and compression schemes to optimize it for efficient data storage and retrieval in bulk.

Read full description

Application Performance Monitoring (APM)

APM helps developers detect and diagnose application performance issues, allowing them to address problems promptly and maintain a high level of service and user experience.

Read full description

ARIMA

An Autoregressive Integrated Moving Average (ARIMA) model is a widely used time series forecasting technique.

Read full description
B

Batch Processing

Batch processing is a computer processing technique where a large amount of data is collected and processed at once rather than in real time.

Read full description
C

CAP Theorem

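The CAP theorem is a computer science principle describing the tradeoffs involved in designing distributed databases: a distributed system cannot simultaneously guarantee consistency, availability, and partition tolerance.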

Read full description

Cardinality

In the context of databases, cardinality is the number of unique sets of data stored in a database. Specifically, it refers to the total number of unique values possible within a table column or database equivalent.

Read full description
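
To make the cardinality definition concrete, the following sketch counts the distinct values in a column of a small in-memory table. The rows, column names, and the helper `column_cardinality` are hypothetical and exist only for illustration.

```python
# A tiny in-memory "table"; each dict is one row (hypothetical data).
rows = [
    {"host": "server-a", "region": "us-east"},
    {"host": "server-b", "region": "us-east"},
    {"host": "server-a", "region": "eu-west"},
    {"host": "server-c", "region": "us-east"},
]


def column_cardinality(rows, column):
    """Count the distinct values that appear in one column."""
    return len({row[column] for row in rows})


host_cardinality = column_cardinality(rows, "host")
region_cardinality = column_cardinality(rows, "region")
print(host_cardinality, region_cardinality)
```

Here the `host` column has a higher cardinality than the `region` column, because it holds three distinct values against two.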

Column Database

Column databases are a type of DBMS that store data formatted in columns rather than rows and are optimized for analytics workloads.

Read full description

Continuous Profiling

Continuous profiling is a technique for identifying and resolving potential performance bottlenecks and other issues by collecting profiling data from applications while they run.

Read full description
D

Data Fabric

Data fabric is an architectural concept with the potential to transform the way organizations manage and utilize their data.

Read full description

Data Governance

Data governance is a framework that establishes the rules, policies, and procedures for organizations to ensure the quality, integrity, security, and usability of their data assets.

Read full description

Data Historian

A Data Historian is a type of software specifically designed for capturing and storing time-series data from industrial operations.

Read full description

Data Lakes Explained

Data lakes serve as a centralized repository, enabling the storage of large volumes of both structured and unstructured data.

Read full description

Data Mesh

Data mesh is a decentralized approach to data architecture in which domain teams own their data and serve it to the rest of the organization as a product.

Read full description

Data Modeling Explained

Data modeling is the process of creating a representation of an organization or application's data and the relationships between different data points.

Read full description

Data Warehouse

A data warehouse is a data management system that supports business intelligence, such as data analysis. Data warehouses help you make more insightful decisions about your business.

Read full description

Database as a Service (DBaaS)

Database-as-a-service (DBaaS) is a cloud computing service that provides access to a cloud database system without needing to set up, configure, or manage software or physical infrastructure.

Read full description

Database Denormalization

Database denormalization is a process for designing a database to enable faster data reads. One way to do this is by introducing redundant data where necessary.

Read full description

Database Indexing

The index of a database table acts like the index in a physical textbook: it lets the database locate the rows it needs without scanning the entire table.

Read full description

Database Sharding

Database sharding is a strategy for scaling a database by breaking it into smaller, more manageable pieces.

Read full description
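
One common way to break a database into pieces, as described in the sharding entry above, is to hash each record's key and use the result to pick a shard. The sketch below assumes four shards, string keys, and a hypothetical `shard_for` helper; range-based and directory-based sharding are equally valid strategies that it does not cover.

```python
import hashlib


def shard_for(key, shard_count):
    """Map a string key to a shard number in the range [0, shard_count).

    SHA-256 is used so the mapping is stable across processes, unlike
    Python's built-in hash(), which is randomized for strings.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


# Hypothetical setup: four shards, each modeled as a plain list.
SHARD_COUNT = 4
shards = {i: [] for i in range(SHARD_COUNT)}

for user_id in ["user-1", "user-2", "user-3", "user-4", "user-5"]:
    shards[shard_for(user_id, SHARD_COUNT)].append(user_id)

for shard_id, keys in shards.items():
    print(shard_id, keys)
```

With simple modulo hashing, changing the shard count remaps most keys, which is why some systems prefer consistent hashing instead.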

DevSecOps

DevSecOps is an extension of DevOps that's sometimes known as secure DevOps. The main objective is to integrate security into the software development workflow.

Read full description

Digital Twins

A digital twin is a virtual model or replica of a physical object, process, or system that is used for simulation, analysis, and understanding.

Read full description

Distributed Database

A distributed database system spreads data storage and processing across multiple servers instead of relying on a single server.

Read full description

Distributed Tracing

Tracing is a method for understanding how interconnected components of a distributed system interact with each other.

Read full description
E

Edge Computing

Edge computing is a type of computing that happens near a data source. It allows you to perform computing tasks as close to an IoT device or end user as possible instead of using a data center or the cloud.

Read full description

ETL (Extract, Transform, Load)

ETL stands for Extract, Transform, Load and is the process of moving and manipulating data from different sources before storing it in another database.

Read full description
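
The three ETL stages can be sketched with the Python standard library alone. In this example the source is an inline CSV string and the destination is an in-memory SQLite database; the column names, the Fahrenheit-to-Celsius conversion, and the `readings` table are assumptions made up for the illustration.

```python
import csv
import io
import sqlite3

# Hypothetical raw source data; sensor B has a missing temperature.
RAW_CSV = "sensor,temp_f\nA,68.0\nB,\nC,77.0\n"


def extract(text):
    """Extract: parse CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(records):
    """Transform: drop rows with no reading and convert to Celsius."""
    cleaned = []
    for record in records:
        if not record["temp_f"]:
            continue
        temp_c = round((float(record["temp_f"]) - 32) * 5 / 9, 2)
        cleaned.append((record["sensor"], temp_c))
    return cleaned


def load(rows, conn):
    """Load: write the transformed rows into the destination table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS readings (sensor TEXT, temp_c REAL)"
    )
    conn.executemany("INSERT INTO readings VALUES (?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(transform(extract(RAW_CSV)), conn)
stored = conn.execute(
    "SELECT sensor, temp_c FROM readings ORDER BY sensor"
).fetchall()
print(stored)
```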
F

FDAP Stack

The FDAP stack is a set of components (Apache Arrow Flight, DataFusion, Arrow, and Parquet) used for building high-performance data applications.

Read full description

Fog Computing

Fog computing is a type of decentralized computing infrastructure that extends cloud computing capabilities to the edge of an enterprise's network.

Read full description

Fuzz Testing

Fuzz testing, also referred to as fuzzing, is a software testing technique designed to identify vulnerabilities, bugs, and potential crashes in software applications, systems, or protocols.

Read full description
I

Inverted Index

An inverted index is a data structure that is commonly used as a database index to allow for fast full text search.

Read full description
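
A minimal sketch of an inverted index: each word maps to the set of documents that contain it, so a text query becomes a set intersection instead of a scan over every document. The sample documents and the helper names are hypothetical, and tokenization is simplified to lowercasing and splitting on whitespace.

```python
from collections import defaultdict


def build_inverted_index(documents):
    """Map each word to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return the IDs of documents containing every word in the query."""
    words = query.lower().split()
    if not words:
        return set()
    # Copy the first posting set so the index itself is never mutated.
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical documents keyed by ID.
documents = {
    1: "time series database",
    2: "columnar database format",
    3: "time series plot",
}
index = build_inverted_index(documents)
matches = search(index, "time series")
print(matches)
```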

IoT Cloud

IoT cloud is an internet-based cloud service that stores data from IoT devices.

Read full description

IoT Devices

The Internet of Things refers to the network of interconnected "things" with sensors, software, processing ability, and other technologies that connect and exchange data with other internet-connected devices. IoT devices include smartphones, medical sensors, fitness trackers, smart security systems, and other technologies.

Read full description

IoT Platform

An IoT platform is crucial in the development and administration of IoT applications.

Read full description

IoT Security

In the digital landscape, IoT security plays a pivotal role in ensuring the confidentiality, integrity, and availability of data transmitted and stored across networks.

Read full description
M

MLOps: A Comprehensive Guide to Machine Learning Operations

MLOps (machine learning operations) covers automating workflows, deploying ML models, and best practices for running AI/ML models in production.

Read full description
N

NoSQL Database

NoSQL databases are a type of database system that provide a mechanism for data storage and retrieval, differing from traditional relational databases which are structured and require data to fit into predefined tables.

Read full description
O

Observability

Software observability is the practice of monitoring your software in order to understand its behavior and identify issues.

Read full description

Online Analytical Processing (OLAP)

Online Analytical Processing (OLAP) is an approach to working with typically multidimensional data for analytics use cases.

Read full description

OPC Unified Architecture (OPC UA)

OPC UA is a cross-platform standard for moving data between sensors and cloud applications.

Read full description

Open Data Architecture

Open data architectures take advantage of modern technologies to improve how organizations work with data at scale.

Read full description

Overall Equipment Effectiveness (OEE)

Overall Equipment Effectiveness (OEE) is a metric for measuring manufacturing productivity.

Read full description
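
OEE is commonly calculated as the product of three ratios: availability, performance, and quality. The entry above does not spell out a formula, so the sketch below assumes that standard definition; the function name and the sample figures are hypothetical.

```python
def oee(availability, performance, quality):
    """Return OEE as availability * performance * quality.

    Each input is a ratio between 0.0 and 1.0 (assumed convention).
    """
    factors = {
        "availability": availability,
        "performance": performance,
        "quality": quality,
    }
    for name, value in factors.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")
    return availability * performance * quality


# Hypothetical shift: 90% uptime, 95% of ideal speed, 99% good parts.
score = oee(0.90, 0.95, 0.99)
print(f"OEE: {score:.1%}")
```

Because the factors multiply, a modest loss in each one compounds into a noticeably lower overall score.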
P

Predictive Analytics

Predictive analytics is a form of analytics that tries to predict future events, trends, or behaviors based on historical and present data.

Read full description

Predictive Maintenance

Predictive Maintenance is a maintenance strategy that uses a combination of sensors and data analysis to detect issues in machinery or other equipment before they become major problems.

Read full description

Prometheus Metrics

Prometheus stores four metric types for monitoring needs: counters, gauges, histograms, and summaries.

Read full description
R

Real-Time Database

Today, a real-time database can mean different things. This glossary page considers two categories of real-time databases.

Read full description

Real User Monitoring

Real User Monitoring (RUM) is the process of collecting user data to gain insights into your application's performance and how it is being used.

Read full description
S

SCADA (Supervisory Control And Data Acquisition)

SCADA stands for Supervisory Control and Data Acquisition. A SCADA system is usually a collection of both software and hardware components that allow supervision and control of industrial plants.

Read full description

Seasonality

Seasonality is the presence of regular and predictable change in time series data.

Read full description

Security Information and Event Management (SIEM)

Security information and event management (SIEM) is a powerful integration of two security systems: security information management (SIM) and security event management (SEM).

Read full description

SQL

SQL is a domain-specific language used in programming and designed for managing data held in a relational database management system.

Read full description

Stationarity

Stationarity refers to a time series where the statistical properties of that series don’t depend on the time when observing it.

Read full description

Stream Processing

Stream processing is a technique to process continuous data (unbounded data) streams where data flows in real time from one point to another, like from a sensor to a database.

Read full description
T

Time Series Plot

A time series plot is a graph of data points plotted against time. This entry covers how time series plots work, their types and use cases, and how to implement them in Python.

Read full description
U

Unstructured Data

Unstructured data refers to information that does not fit into a pre-defined model or structure. It is typically non-relational and often unorganized or incomplete, unlike structured data which fits into well-defined models, such as tables and fields.

Read full description
V

Vector Processing

Vector processing is a computing method that processes many data elements at once: a single operation is applied to every element of a vector.

Read full description